Optimal tests shrinking both means and variances applicable to microarray data analysis.

Authors

  • J T Gene Hwang
  • Peng Liu
Abstract

As a consequence of the "large p small n" characteristic for microarray data, hypothesis tests based on individual genes often result in low average power. There are several proposed tests that attempt to improve power. Among these, the FS test that was developed using the concept of James-Stein shrinkage to estimate the variances showed a striking average power improvement. In this paper, we establish a framework in which we model the key parameters with a distribution to find an optimal Bayes test which we call the MAP test (where MAP stands for Maximum Average Power). Under this framework, the FS test can be derived as an empirical Bayes test approximating the MAP test corresponding to modeling the variances. By modeling both the means and the variances with a distribution, a MAP statistic is derived which is optimal in terms of average power but is computationally intensive. An empirical Bayes test called the FSS test is derived as an approximation to the MAP tests and can be computed instantaneously. The FSS statistic shrinks both the means and the variances and has numerically identical average power to the MAP tests. Much numerical evidence is presented in this paper that shows that the proposed test performs uniformly better in average power than the other tests in the literature, including the classical F test, the FS test, the test of Wright and Simon, the moderated t-test, SAM, Efron's t test, the B-statistic and Storey's optimal discovery procedure. A theory is established which indicates that the proposed test is optimal in power when controlling the false discovery rate (FDR).
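The abstract does not give formulas for the FS or FSS statistics, so the following is only a minimal sketch of the general idea of variance shrinkage, not the authors' method. It assumes a one-sample setting, shrinks per-gene log variances toward their overall mean with a fixed, hypothetical `weight`, and leaves the means unshrunk. The function names and the choice of a fixed weight are illustrative assumptions; an empirical Bayes procedure would estimate the amount of shrinkage from the data.

```python
import math
import statistics


def shrink_toward_mean(values, weight):
    """Pull each value toward the overall mean.

    weight=0 leaves the values unchanged; weight=1 replaces each by the mean.
    The fixed weight is a hypothetical tuning constant, not an estimated one.
    """
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [0, 1]")
    center = statistics.fmean(values)
    return [center + (1.0 - weight) * (v - center) for v in values]


def shrunken_t_statistics(data, weight):
    """Return one t-like statistic per gene.

    data is a list of per-gene replicate lists (at least two replicates each,
    with nonzero sample variance). Variances are shrunk on the log scale so
    that the shrunken variances stay positive.
    """
    means = [statistics.fmean(row) for row in data]
    log_vars = [math.log(statistics.variance(row)) for row in data]
    shrunk_vars = [math.exp(lv) for lv in shrink_toward_mean(log_vars, weight)]
    return [
        m / math.sqrt(v / len(row))
        for m, v, row in zip(means, shrunk_vars, data)
    ]


# Three hypothetical genes with three replicates each.
example = [[1.0, 2.0, 3.0], [0.0, 0.5, 1.0], [-1.0, 1.0, 0.0]]
plain = shrunken_t_statistics(example, weight=0.0)   # ordinary t statistics
pooled = shrunken_t_statistics(example, weight=1.0)  # one common variance
```

With `weight=0.0` the result reduces to the ordinary gene-by-gene t statistic; with `weight=1.0` every gene shares a single variance (the geometric mean of the sample variances). Intermediate weights trade off between the two.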

Similar resources

Optimal Shrinkage Estimation of Variances With Applications to Microarray Data Analysis

Microarray technology allows a scientist to study genomewide patterns of gene expression. Thousands of individual genes are measured with a relatively small number of replications, which poses challenges to traditional statistical methods. In particular, the gene-specific estimators of variances are not reliable and gene-by-gene tests have low powers. In this article we propose a family of shri...

Modification of the Fast Global K-means Using a Fuzzy Relation with Application in Microarray Data Analysis

Recognizing genes with distinctive expression levels can help in prevention, diagnosis and treatment of the diseases at the genomic level. In this paper, fast Global k-means (fast GKM) is developed for clustering the gene expression datasets. Fast GKM is a significant improvement of the k-means clustering method. It is an incremental clustering method which starts with one cluster. Iteratively ...

Knowledge discovery in DNA microarray data of cancer patients with emergent self organizing maps

DNA microarrays provide a powerful means of monitoring thousands of gene expression levels at the same time. They consist of high dimensional data sets which challenge conventional clustering methods. The data’s high dimensionality calls for Self Organizing Maps (SOMs) to cluster DNA microarray data. This paper shows that a precise estimation of the variables’ variances is, however, the key to ...

Robust Semiparametric Optimal Testing Procedure for Multiple Normal Means

In high-dimensional gene expression experiments such as microarray and RNA-seq experiments, the number of measured variables is huge while the number of replicates is small. As a consequence, hypothesis testing is challenging because the power of tests can be very low after controlling multiple testing error. Optimal testing procedures with high average power while controlling false discovery r...

Microarray Image Analysis for Detecting the Type of Breast Cancer

Background: Microarray technology is a powerful tool to study and analyze the behavior of thousands of genes simultaneously. Images of microarray have an important role in the detection and treatment of diseases. The aim of this study is to provide an automatic method for the extraction and analysis of microarray images to detect cancerous diseases. Methods: The proposed system consists of t...

Journal:
  • Statistical applications in genetics and molecular biology

Volume 9

Publication year: 2010